Method for describing the composition of audio signals



The invention relates to a method for describing the
composition of audio signals, especially for
the spatialization of MPEG-4 encoded audio signals in a 3D
domain.






The MPEG-4 Audio standard as defined in ISO/IEC 14496-3 and
14496-1 facilitates a wide variety of applications by
supporting the representation of audio objects. For the
combination of the audio objects, additional information, the
so-called scene description, determines the placement in space
and time and is transmitted together with the coded audio
objects.



A scene description is structured hierarchically and can be
represented as a graph, wherein the leaf nodes of the graph form
the separate objects and the other nodes describe the
processing, e.g. positioning, scaling, effects. A node named
"Sound" allows spatialization of the audio signal in a 3D
domain. A further node with the name "Sound2D" only allows
spatialization on a 2D screen. The use of the "Sound" node
in a 2D graphical player is not specified due to different
implementations of the properties in a 2D and 3D player.



The Sound2D node is defined as follows:

Sound2D {
  exposedField SFFloat intensity  1.0
  exposedField SFVec2f location   0,0
  exposedField SFNode  source     NULL
  field        SFBool  spatialize TRUE
}

and the Sound node is defined as follows:

Sound {
  exposedField SFVec3f direction  0, 0, 1
  exposedField SFFloat intensity  1.0
  exposedField SFVec3f location   0, 0, 0
  exposedField SFFloat maxBack    10.0
  exposedField SFFloat maxFront   10.0
  exposedField SFFloat minBack    1.0
  exposedField SFFloat minFront   1.0
  exposedField SFFloat priority   0.0
  exposedField SFNode  source     NULL
  field        SFBool  spatialize TRUE
}



In the following, the general term for all sound nodes
(Sound2D, Sound and DirectiveSound) is written in
lower case, e.g. 'sound nodes'.




Please note that the Sound node is actually a 3D node.



In the simplest case the Sound or Sound2D node is connected
via an AudioSource node to the decoder output. The sound
nodes contain the intensity and the location information.
From the audio point of view a sound node is the final node
before the loudspeaker mapping. In the case of several sound
nodes, their outputs are summed. From the systems point
of view the sound nodes can be seen as an entry point for
the audio subgraph. A sound node can be grouped with
non-audio nodes into a Transform node that will set its original
location.
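The summation of several sound nodes can be illustrated with a minimal sketch. It assumes that the intensity field acts as a linear gain on the decoded samples and that each node delivers a single channel; the class SoundNodeSketch and the function mix_sound_nodes are hypothetical names used only for this illustration and are not part of the standard.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SoundNodeSketch:
    """Hypothetical stand-in for a sound node: a gain and decoded samples."""
    intensity: float            # assumed to be a linear factor in 0.0..1.0
    samples: Sequence[float]    # decoded single-channel samples (assumption)


def mix_sound_nodes(nodes: List[SoundNodeSketch]) -> List[float]:
    """Sum the intensity-scaled outputs of all sound nodes sample by sample."""
    if not nodes:
        return []
    length = max(len(node.samples) for node in nodes)
    mixed = [0.0] * length
    for node in nodes:
        for i, sample in enumerate(node.samples):
            mixed[i] += node.intensity * sample
    return mixed
```

Shorter inputs are treated as silent after their last sample, which is a design choice of this sketch rather than a rule taken from the standard.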



With the phaseGroup field of the AudioSource node, it is
possible to mark channels that contain important phase
relations, as in the case of "stereo pair", "multichannel",
etc. A mixed operation of phase-related channels and
non-phase-related channels is allowed. A spatialize field in the
sound nodes specifies whether the sound shall be spatialized
or not. This applies only to channels that are not members
of a phase group.
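The interplay of the spatialize field and the phase groups can be sketched as follows. The sketch assumes that phaseGroup carries one integer per channel and that the value 0 marks a channel without a phase relation to any other channel; the function name channels_to_spatialize is hypothetical.

```python
from typing import List


def channels_to_spatialize(spatialize: bool, phase_group: List[int]) -> List[int]:
    """Return the indices of the channels that may be spatialized.

    Assumption: a phase_group entry of 0 means the channel belongs to no
    phase group; any other value ties it to the channels sharing that value.
    """
    if not spatialize:
        return []
    return [index for index, group in enumerate(phase_group) if group == 0]
```

For example, with a stereo pair in group 1, a second pair in group 2 and two independent monophonic channels, only the two independent channels are candidates for spatialization.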


The Sound2D node can spatialize the sound on the 2D screen. The
standard states that the sound should be spatialized on a scene
of size 2 m x 1.5 m at a distance of one meter. This
explanation seems to be ineffective because the value of the
location field is not restricted, and therefore the sound can
also be positioned outside the screen area. We propose to
remove the confusing description.

The Sound and DirectiveSound nodes can set the location
anywhere in the 3D space. The mapping to the existing
loudspeaker placement can be done using simple amplitude panning
or more sophisticated techniques.
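As an illustration of simple amplitude panning, the following sketch applies a constant-power (sine/cosine) pan law to a two-loudspeaker layout. This is one common pan law chosen for the example, not a method prescribed by the standard; the loudspeaker half-span of 30 degrees, the sign convention (positive azimuth to the right) and the function name constant_power_pan are assumptions.

```python
import math
from typing import Tuple


def constant_power_pan(azimuth: float,
                       half_span: float = math.pi / 6) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a source at `azimuth` radians.

    Assumptions: loudspeakers sit at -half_span (left) and +half_span
    (right); sources outside that arc are clamped to the nearer loudspeaker.
    """
    clamped = max(-half_span, min(half_span, azimuth))
    # Map [-half_span, +half_span] linearly onto [0, pi/2].
    theta = (clamped + half_span) / (2.0 * half_span) * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)
```

Because the gains are the cosine and sine of the same angle, their squares always sum to one, so the perceived loudness stays roughly constant while the source moves between the loudspeakers.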

Both Sound and Sound2D can handle multichannel inputs and
basically have the same functionalities, but the Sound2D
node cannot spatialize a sound other than to the front.

From games, cinema and TV applications we know that it
makes sense to provide the end user with a fully spatialized
"3D-Sound" presentation, even if the video presentation is
limited to a small flat screen in front.

A first proposal is to add Sound and Sound2D to all scene
graph profiles, i.e. add the Sound node to the SF2DNode
group.

But one reason for not including the "3D" sound nodes in
the 2D scene graph profiles is that a typical 2D player is
not capable of handling 3D vectors (SFVec3f type), as would
be required for the Sound direction and location fields.

Another reason is that the Sound node is specially designed
for virtual reality scenes with moving listening points and
attenuation attributes for far-distance sound objects. For
this, the ListeningPoint node and the Sound maxBack,
maxFront, minBack and minFront fields are defined.

The proposal is to extend the old Sound2D node or to define
a new Sound2Ddepth node. The Sound2Ddepth node should be
similar to the Sound2D node but with an additional depth field.

Sound2Ddepth {
  exposedField SFFloat intensity  1.0
  exposedField SFVec2f location   0,0
  exposedField SFFloat depth      0.0
  exposedField SFNode  source     NULL
  field        SFBool  spatialize TRUE
}



The intensity field adjusts the loudness of the sound. Its
value ranges from 0.0 to 1.0 and specifies a
factor that is applied during playback of the sound.

The location field specifies the location of the sound in
the 2D scene.

The depth field specifies the depth of the sound in the 2D
scene, using the same coordinate system as the location
field. The default value is 0.0, which refers to the screen
position.

The spatialize field specifies whether the sound shall be
spatialized. If this flag is set, the sound shall be
spatialized with the maximum sophistication possible.

The same rules for multichannel audio spatialization apply
to the Sound2Ddepth node as to the Sound (3D) node.
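How the 2D location and the depth value together yield a 3D position can be shown with a minimal sketch. The class Sound2DDepthSketch and its method position_3d are hypothetical; the source field is omitted for brevity, and the sign convention of the depth axis is not fixed by the description, so the value is passed through unchanged.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Sound2DDepthSketch:
    """Hypothetical mirror of the Sound2Ddepth fields and their defaults."""
    intensity: float = 1.0
    location: Tuple[float, float] = (0.0, 0.0)
    depth: float = 0.0          # 0.0 refers to the screen position
    spatialize: bool = True

    def position_3d(self) -> Tuple[float, float, float]:
        """Combine the 2D location with the 1D depth into a 3D position."""
        x, y = self.location
        return (x, y, self.depth)
```

A 2D player therefore never has to handle an SFVec3f value: it stores a 2D vector and a scalar, and only the audio renderer combines them.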



Using the Sound2D node in a 2D scene allows presenting
surround sound as the author recorded it. It is not possible
to spatialize a sound other than to the front. Spatializing
means moving the location of a monophonic signal in response
to user interaction or scene updates.

With the new proposal of a Sound2Ddepth node it is possible
to spatialize a sound also behind, beside or above
the listener, provided the audio presentation system has
the capability to present it.

The invention allows the spatialization of the audio signal
in a 3D domain, even if the player is restricted to 2D
graphics.


A further embodiment is as follows: 

The audio nodes 'Sound', 'Sound2D' and 'DirectiveSound'
described in the MPEG-4 audio standard have, besides the
functionality to present phase-related multichannel signals
(>1 channel), the functionality to present accumulated
monophonic sounds everywhere in the 2D or 3D space. Their
position in these spaces can be addressed by using the
'location' field of these nodes. The location is a 2D or 3D
vector. If the 'spatialize' field of these nodes is set to
'true', the sound will be spatialized depending on the
'location'. For the spatialization process different algorithms
can be used, for example amplitude panning.

With these techniques it is only possible to position a
sound as a point sound source. The sound has no 'width'.

Different methods are currently under discussion to position
a sound with a noticeable width in a virtual reality world space.
The approach of these algorithms is to describe a size or a
shape in a 3D coordinate system.

There has been no approach for a user-oriented system. The
solution proposed here is to insert two new fields into the
Sound and Sound2D nodes to control the width of the sound
with an opening angle relative to the listener. The angle
has a horizontal and a vertical component, 'widthHorizontal'
and 'widthVertical', each ranging from 0 to 2π with the location
as its center. The widthHorizontal is shown in Fig. 3. The
widthVertical is similar, with a 90-degree x-y-rotated
relation.
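The horizontal opening angle can be illustrated with a short sketch that returns the two edge directions of the sound as seen from the listener. It assumes that the azimuth of the location is already known in radians and that the width is centered on it, as stated above; the function name horizontal_extent is hypothetical, and the same calculation would apply to widthVertical with an elevation angle.

```python
import math
from typing import Tuple


def horizontal_extent(center_azimuth: float,
                      width_horizontal: float) -> Tuple[float, float]:
    """Return the (start, end) azimuths spanned by a sound of given width.

    The width must lie in 0..2*pi; the location azimuth is the center of
    the span. Angles are in radians and are not wrapped in this sketch.
    """
    if not 0.0 <= width_horizontal <= 2.0 * math.pi:
        raise ValueError("widthHorizontal must lie in the range 0..2*pi")
    half_width = width_horizontal / 2.0
    return (center_azimuth - half_width, center_azimuth + half_width)
```

A width of 0 reduces the span to the point source described earlier, while a width of 2π surrounds the listener completely.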

The width of a sound can be generated, for example, as
described in DE-A-196 32 734, 'Methods and apparatus for
generating a multichannel signal from a mono signal'.

An application is, for example, the monophonic transmission of
the violins in an orchestra instead of their transmission as
stereo signals, which are coupled to a fixed loudspeaker layout
and offer no possibility to position them at a desired location.

Claims 

1. Method for describing the composition of audio signals,
which are encoded as separate audio objects, wherein
the arrangement and the processing of the audio objects
in a sound scene is described by nodes arranged
hierarchically in a scene description, wherein a node allows
spatialization on a 2D screen using a 2D vector,
characterized by describing a 3D position of an audio
object using said 2D vector and a 1D value describing
the depth of said audio object.

2. Method according to claim 1, characterized by
controlling the width of an audio object by using two
additional fields inserted into the Sound and Sound2D
nodes describing an opening angle relative to the
listener.

Abstract

Method for describing the composition of audio signals,
which are encoded as separate audio objects. The arrangement
and the processing of the audio objects in a sound scene is
described by nodes arranged hierarchically in a scene
description. A node specified only for spatialization on a 2D
screen using a 2D vector describes a 3D position of an audio
object using said 2D vector and a 1D value describing the
depth of said audio object.

In a further embodiment two new fields are inserted into the
Sound and Sound2D nodes to control the width of the sound
with an opening angle relative to the listener.